Recognition of Structured Collocations in An Inflective Language

Authors

  • Bartosz Broda
  • Magdalena Derwojedowa
  • Maciej Piasecki
Abstract

We present a method for extracting structural collocations in an inflective language (Polish). The process has two phases: extraction and filtering of pairs of wordforms reduced to base forms, followed by structural annotation of the extracted collocations with lexico-syntactic patterns. The parameters of the patterns are specified manually, but their instances are generated and tested on the corpus automatically. The extracted collocations were evaluated by applying them as rules in morpho-syntactic disambiguation of Polish and by comparing them with lists of two-word expressions extracted from two Polish dictionaries.
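The two phases named in the abstract can be illustrated with a minimal sketch. The abstract does not say which statistics or pattern language the authors use, so everything concrete here is an assumption: the function names `extract_pairs` and `annotate` are hypothetical, candidates are limited to adjacent tokens, filtering uses a frequency threshold plus pointwise mutual information, and the second phase only records which wordform pairs realise each base-form pair, a much simpler stand-in for lexico-syntactic patterns.

```python
import math
from collections import Counter


def extract_pairs(sentences, min_count=2, min_pmi=1.0):
    """Phase 1 (sketch): count adjacent base-form pairs, then filter them.

    `sentences` is a list of sentences, each a list of (wordform, baseform)
    tuples. The PMI filter and both thresholds are assumptions, not taken
    from the paper.
    """
    unigrams = Counter()
    pairs = Counter()
    for sent in sentences:
        bases = [base for _, base in sent]
        unigrams.update(bases)
        pairs.update(zip(bases, bases[1:]))

    total_uni = sum(unigrams.values())
    total_pairs = sum(pairs.values())
    kept = {}
    for (a, b), n in pairs.items():
        if n < min_count:
            continue
        p_pair = n / total_pairs
        p_indep = (unigrams[a] / total_uni) * (unigrams[b] / total_uni)
        if math.log2(p_pair / p_indep) >= min_pmi:
            kept[(a, b)] = n
    return kept


def annotate(sentences, kept):
    """Phase 2 (simplified): tally the wordform pairs behind each kept pair."""
    patterns = {pair: Counter() for pair in kept}
    for sent in sentences:
        for (w1, b1), (w2, b2) in zip(sent, sent[1:]):
            if (b1, b2) in patterns:
                patterns[(b1, b2)][(w1, w2)] += 1
    return patterns


# Toy corpus: "czarna dziura" (nominative) and "czarnej dziury" (genitive)
# share the base-form pair ("czarny", "dziura").
corpus = [
    [("czarna", "czarny"), ("dziura", "dziura")],
    [("czarnej", "czarny"), ("dziury", "dziura"), ("nie", "nie"), ("widać", "widać")],
    [("widać", "widać"), ("dom", "dom")],
]
kept = extract_pairs(corpus)
patterns = annotate(corpus, kept)
```

Counting over base forms rather than wordforms is what lets the two inflected variants in the toy corpus pool into a single candidate, which is the point of the reduction step for an inflective language.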

Similar resources

Combination of a hidden tag model and a traditional n-gram model: a case study in Czech speech recognition

A speech recognition system targeting high inflective languages is described that combines the traditional trigram language model and an HMM tagger, obtaining results superior to the trigram language model itself. An experiment in speech recognition of Czech has been performed with promising results. 1. Speech Recognition of Inflective Languages Inflective languages pose a hard problem in speec...


The Effects of Collaborative Versus Non-collaborative Massed and Distributed Presentation on the Comprehension and Production of Lexical Collocations

To investigate the effect of massed and distributed collaborative and non-collaborative presentation on L2 learners’ comprehension and production of lexical collocations, 105 participants at Takestan Islamic Azad University in 4 groups were assigned to four different treatment conditions (collaborative-massed; collaborative-distributed; noncollaborative-massed; and noncollaborative-distributed ...


The Effects of Collaborative and Individual Output Tasks on Learning English Collocations

  One of the most problematic areas in foreign language learning is collocation. It is often seen as arbitrary and an overwhelming obstacle to the achievement of nativelike fluency. Current second language (L2) instruction research has encouraged the use of collaborative output tasks in L2 classrooms. This study examined the effects of two types of output tasks (editing and cloze) on the learni...


Collocational Processing in Two Languages: A psycholinguistic comparison of monolinguals and bilinguals

With the renewed interest in the field of second language learning for the knowledge of collocating words, research findings in favour of holistic processing of formulaic language could support the idea that these language units facilitate efficient language processing. This study investigated the difference between processing of a first language (L1) and a second language (L2) of congruent col...


Speech Recognition of Czech - Inclusion of Rare Words Helps

Large vocabulary continuous speech recognition of inflective languages, such as Czech, Russian or Serbo-Croatian, is heavily deteriorated by excessive out of vocabulary rate. In this paper, we tackle the problem of vocabulary selection, language modeling and pruning for inflective languages. We show that by explicit reduction of out of vocabulary rate we can achieve significant improvements in ...




Publication date: 2007